{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 开源词嵌入向量实验\n",
    "\n",
    "1. tencent ailab\n",
    "    - 读取腾讯词典，导入切词工具\n",
    "    - 将预训练词向量导入模型词嵌入层\n",
    "    \n",
    "2. google glove\n",
    "    - 读取词典导入切词工具\n",
    "    - 将预训练词向量导入模型词嵌入层"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 1. Tencent AI-lib \n",
    "\n",
    "下载数据：[https://ai.tencent.com/ailab/nlp/embedding.html](https://ai.tencent.com/ailab/nlp/embedding.html)\n",
    "\n",
    "Tencent AI Lab Embedding Corpus for Chinese Words and Phrases\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## 1.1 读取腾讯词典，导入切词工具\n",
    "\n",
    "1. 生成词典\n",
    "\n",
    "生成词典文件`tencent.bin`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "from tqdm import tqdm\n",
    "\n",
    "def gendict(inputFile, ouputFile):\n",
    "    output_f = open(ouputFile, 'ab')\n",
    "    with open(inputFile, \"r\", encoding='ISO-8859-1') as f:\n",
    "        header = f.readline()\n",
    "        vocab_size, vector_size = map(int, header.split())\n",
    "        for i in tqdm(range(vocab_size)):\n",
    "            line = f.readline()\n",
    "            lists = line.split(' ')\n",
    "            word = lists[0]\n",
    "            try: \n",
    "                word = word.encode('ISO-8859-1').decode('utf8')\n",
    "                output_f.write((word+'\\n').encode('utf8'))\n",
    "            except: pass\n",
    "        output_f.close()\n",
    "        f.close()\n",
    "\n",
    "\n",
    "inputfile = 'E:\\\\Desktop\\\\nlp\\\\Tencent_AILab_ChineseEmbedding.txt'\n",
    "outputfile = 'E:\\\\Desktop\\\\nlp\\\\tencent.bin'\n",
    "#gendict(inputfile, outputfile)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "2. 读取词典\n",
    "\n",
    "将词典文`tencent.bin`导入分词工具。\n",
    "- jieba\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Building prefix dict from the default dictionary ...\n",
      "Loading model from cache C:\\Users\\Hongwen\\AppData\\Local\\Temp\\jieba.cache\n",
      "Loading model cost 0.787 seconds.\n",
      "Prefix dict has been built succesfully.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "['我',\n",
       " '今天',\n",
       " '吃',\n",
       " '了',\n",
       " '西红柿',\n",
       " '炒面',\n",
       " '，',\n",
       " '隔壁',\n",
       " '的',\n",
       " '人',\n",
       " '也',\n",
       " '是',\n",
       " '因吹斯',\n",
       " '听',\n",
       " '的',\n",
       " '人',\n",
       " '。']"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import jieba\n",
    "jieba.lcut('我今天吃了西红柿炒面，隔壁的人也是因吹斯听的人。')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|█████████████████████████████████████████████████████████████████████| 9046401/9046401 [04:05<00:00, 36854.88it/s]\n"
     ]
    }
   ],
   "source": [
    "import jieba\n",
    "from tqdm import tqdm\n",
    "jieba.add_word('开始')\n",
    "\n",
    "def load_userdict(f):\n",
    "    f = open(f, 'r', encoding='utf8')\n",
    "    data = f.readlines()\n",
    "    for i in tqdm(range(len(data))):\n",
    "        word = data[i].strip('\\n')\n",
    "        jieba.add_word(word)\n",
    "\n",
    "load_userdict('E:\\\\Desktop\\\\nlp\\\\tencent.bin')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['我', '今天', '吃了', '西红柿炒面', '，', '隔壁', '的人', '也是', '因吹斯听', '的人', '。']"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "jieba.lcut('我今天吃了西红柿炒面，隔壁的人也是因吹斯听的人。')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## 1.2 将读取到的词向量载入模型embedding层\n",
    "\n",
    "- **embeddingFile:** `Tencent_AILab_ChineseEmbedding.txt`\n",
    "- **word2id:** GenData.ch2id\n",
    "- **embeddingSize:** 200\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "from tqdm import tqdm\n",
    "\n",
    "def loadEmbedding(embeddingFile, word2id, embeddingSize):\n",
    "    with open(embeddingFile, \"r\", encoding='ISO-8859-1') as f:\n",
    "        header = f.readline()\n",
    "        vocab_size, vector_size = map(int, header.split())\n",
    "        initW = np.random.uniform(-0.25,0.25,(len(word2id), vector_size))\n",
    "        count = 0\n",
    "        print('loadding embedding data from tencent ailab ...')\n",
    "        for i in tqdm(range(vocab_size)):\n",
    "            line = f.readline()\n",
    "            lists = line.split(' ')\n",
    "            word = lists[0]\n",
    "            try: word = word.encode('ISO-8859-1').decode('utf8')\n",
    "            except: pass\n",
    "            if word in word2id:\n",
    "                count += 1\n",
    "                number = map(float, lists[1:])\n",
    "                number = list(number)\n",
    "                vector = np.array(number)\n",
    "                initW[word2id[word]] = vector\n",
    "        print(count)\n",
    "        return initW"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "loadding embedding data from tencent ailab ...\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|█████████████████████████████████████████████████████████████████████| 8825658/8825658 [02:28<00:00, 59376.01it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "239\n",
      "(248, 200)\n",
      "WARNING:tensorflow:From d:\\ProgramData\\Anaconda3\\lib\\site-packages\\tensorflow\\python\\util\\tf_should_use.py:118: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.\n",
      "Instructions for updating:\n",
      "Use `tf.global_variables_initializer` instead.\n",
      "[ 0.21811423 -0.09034403 -0.02901633  0.15603746 -0.07284238  0.09341685\n",
      "  0.2026991  -0.071945   -0.04600149  0.18145561  0.179964   -0.15550348\n",
      " -0.16100383 -0.15658209 -0.13439766 -0.23189929  0.19315624 -0.04714389\n",
      " -0.00332712  0.1696456   0.20265692  0.18811639 -0.20272435  0.03150773\n",
      "  0.22014199 -0.1266679  -0.02821003 -0.01831185  0.11978181 -0.22922224\n",
      " -0.00780576  0.04416671 -0.22243042 -0.22626986  0.15837246 -0.22912757\n",
      "  0.22339202 -0.06263614 -0.1612309   0.10037225 -0.09283496 -0.0659591\n",
      " -0.09785461 -0.07583669  0.21064714 -0.22052026  0.21821363  0.0054501\n",
      " -0.19645277  0.18829118  0.0121442  -0.17428291  0.03204966 -0.16647246\n",
      " -0.09764255  0.01065319 -0.12923214 -0.24241629  0.00064717 -0.07459965\n",
      "  0.10835728 -0.18958226 -0.17542696 -0.07144472  0.24785972  0.14723393\n",
      " -0.13104634 -0.11659891 -0.09243217 -0.23772074  0.20870206  0.214679\n",
      "  0.2485326   0.21340017 -0.16217211 -0.13884827 -0.15664385 -0.06153126\n",
      "  0.02621304  0.12916087 -0.1479906   0.05407315  0.20713732  0.03044557\n",
      "  0.22280912 -0.01860463  0.23086224  0.19328775 -0.01556212  0.17192642\n",
      " -0.16302966 -0.05875675  0.11707024  0.05971114  0.05123375 -0.05971126\n",
      " -0.1925253  -0.15465224  0.1778621  -0.19318928 -0.04985993  0.21712136\n",
      " -0.12959671  0.02902289  0.20249118 -0.21336818  0.06835651  0.00707481\n",
      " -0.17025088 -0.20839171 -0.10769601  0.13543367 -0.0667306   0.10250122\n",
      "  0.231176   -0.00975712  0.14727914 -0.01647162 -0.18928911 -0.01918438\n",
      "  0.00690698 -0.16899234  0.14422534 -0.00910255  0.05275811  0.1708545\n",
      " -0.16241977  0.07157063 -0.23228233 -0.09265122  0.20558178  0.17864028\n",
      "  0.18983217  0.05956117  0.10848432 -0.03656358 -0.00831608 -0.16322224\n",
      "  0.15948777 -0.01928804 -0.06069737  0.09936493 -0.21968307  0.09509666\n",
      "  0.16683944  0.14843409  0.207535   -0.17898336 -0.22258636  0.05494418\n",
      " -0.21697517 -0.19227426 -0.10380355  0.17544311  0.08881292 -0.10407088\n",
      " -0.20320007  0.05610187 -0.12847398 -0.1459277  -0.06831509  0.20240517\n",
      " -0.03383845  0.0262997   0.1216047  -0.01861086  0.24923159  0.16468446\n",
      "  0.11820894  0.16421059 -0.11077356 -0.07750682  0.07538304 -0.11342348\n",
      " -0.07986517  0.01898898  0.1562226  -0.21976621  0.05024231 -0.07999938\n",
      " -0.09349404  0.15813684 -0.15759294  0.16468425 -0.14688027  0.19030395\n",
      " -0.22462486  0.05833448  0.1615059   0.16884853 -0.07930598 -0.03536789\n",
      "  0.1996768  -0.2334278  -0.23911378  0.21687704 -0.09867963 -0.00460315\n",
      "  0.17447759  0.24485265]\n"
     ]
    }
   ],
   "source": [
    "import tensorflow as tf\n",
    "import numpy as np\n",
    "# 包含数据处理函数\n",
    "from utils import GenData\n",
    "\n",
    "    \n",
    "data = GenData('cmn.txt','jieba',200)\n",
    "weights = loadEmbedding('E:\\\\Desktop\\\\nlp\\\\Tencent_AILab_ChineseEmbedding.txt', data.ch2id, 200)\n",
    "print(weights.shape)\n",
    "\n",
    "\n",
    "tf.reset_default_graph()\n",
    "embedding = tf.get_variable('embedding', [len(data.ch2id), 200])\n",
    "\n",
    "with tf.Session() as sess:\n",
    "    sess.run(tf.initialize_all_variables())\n",
    "    sess.run(embedding.assign(weights))\n",
    "    print(sess.run(embedding[0]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 2. GOOGLE Glove\n",
    "\n",
    "下载数据：[https://nlp.stanford.edu/projects/glove/](https://nlp.stanford.edu/projects/glove/)\n",
    "\n",
    "## Introduction\n",
    "\n",
    "GloVe is an unsupervised learning algorithm for obtaining vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.1 读取谷歌词典，导入切词工具\n",
    "\n",
    "1. 读取词典\n",
    "\n",
    "生成词典文件`google.bin`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|█████████████████████████████████████████████████████████████████████| 2196017/2196017 [01:06<00:00, 32984.01it/s]\n"
     ]
    }
   ],
   "source": [
    "from tqdm import tqdm\n",
    "\n",
    "def gendict(inputFile, ouputFile):\n",
    "    output_f = open(ouputFile, 'ab')\n",
    "    with open(inputFile, \"r\", encoding='ISO-8859-1') as f:\n",
    "        data = f.readlines()\n",
    "        for i in tqdm(range(len(data))):\n",
    "            line = data[i]\n",
    "            lists = line.split(' ')\n",
    "            word = lists[0]\n",
    "            try: \n",
    "                word = word.encode('ISO-8859-1').decode('utf8')\n",
    "                output_f.write((word+'\\n').encode('utf8'))\n",
    "            except: pass\n",
    "        output_f.close()\n",
    "        f.close()\n",
    "\n",
    "gendict('E:\\\\Desktop\\\\nlp\\\\glove.840B.300d.txt', 'E:\\\\Desktop\\\\nlp\\\\google.bin')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. 读取词典\n",
    "\n",
    "将词典文`tencent.bin`导入分词工具。\n",
    "- jieba\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "load userdict to jieba dict ...\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|█████████████████████████████████████████████████████████████████████| 2196017/2196017 [02:05<00:00, 17502.50it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['Hi', ',', ' ', 'my', ' ', 'name', ' ', 'is', ' ', 'sun', 'hongwen', '.']\n"
     ]
    }
   ],
   "source": [
    "import jieba\n",
    "\n",
    "\n",
    "def load_userdict(f):\n",
    "    f = open(f, 'r', encoding='utf8')\n",
    "    data = f.readlines()\n",
    "    print('load userdict to jieba dict ...')\n",
    "    for i in tqdm(range(len(data))):\n",
    "        word = data[i].strip('\\n')\n",
    "        jieba.add_word(word)\n",
    "\n",
    "load_userdict('E:\\\\Desktop\\\\nlp\\\\google.bin')\n",
    "\n",
    "print(jieba.lcut('Hi, my name is sunhongwen.'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['If', 'a', 'person', 'has', 'not', 'had', 'a', 'chance', 'to', 'acquire', 'his', 'target', 'language', 'by', 'the', 'time', 'he', \"'\", 's', 'an', 'adult', ',', 'he', \"'\", 's', 'unlikely', 'to', 'be', 'able', 'to', 'reach', 'native', 'speaker', 'level', 'in', 'that', 'language.']\n"
     ]
    }
   ],
   "source": [
    "print([char for char in jieba.lcut(\n",
    "        'If a person has not had a chance \\\n",
    "        to acquire his target language by \\\n",
    "        the time he\\'s an adult, he\\'s unlikely \\\n",
    "        to be able to reach native speaker level \\\n",
    "        in that language.') if char != ' '])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.2 将读取到的词向量载入模型embedding层\n",
    "\n",
    "- **embeddingFile:** `glove.840B.300d.txt.txt`\n",
    "- **word2id:** GenData.en2id\n",
    "- **embeddingSize:** 200"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from tqdm import tqdm\n",
    "# 包含数据处理函数\n",
    "\n",
    "def loadEmbedding(embeddingFile, word2id, embeddingSize):\n",
    "    with open(embeddingFile, \"r\", encoding='ISO-8859-1') as f:\n",
    "        data = f.readlines()\n",
    "        initW = np.random.uniform(-0.25,0.25,(len(word2id), embeddingSize))\n",
    "        count = 0\n",
    "        for i in tqdm(range(len(data))):\n",
    "            line = data[i]\n",
    "            lists = line.split(' ')\n",
    "            word = lists[0]\n",
    "            try: word = word.encode('ISO-8859-1').decode('utf8')\n",
    "            except: pass\n",
    "            if word in word2id:\n",
    "                count += 1\n",
    "                number = map(float, lists[1:])\n",
    "                number = list(number)\n",
    "                vector = np.array(number)\n",
    "                initW[word2id[word]] = vector\n",
    "        print(count)\n",
    "        return initW"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'<PAD>': 0, '<UNK>': 1, '!': 2, \"'\": 3, ',': 4, '.': 5, '3': 6, '30.': 7, ':': 8, '?': 9, 'All': 10, 'Am': 11, 'Ask': 12, 'Back': 13, 'Be': 14, 'Birds': 15, 'Call': 16, 'Can': 17, 'Catch': 18, 'Cheers': 19, 'Come': 20, 'Cuff': 21, 'DJ.': 22, 'Definitely': 23, 'Do': 24, 'Dogs': 25, 'Don': 26, 'Drive': 27, 'Excuse': 28, 'Feel': 29, 'Fill': 30, 'Follow': 31, 'Get': 32, 'Go': 33, 'God': 34, 'Good': 35, 'Goodbye': 36, 'Grab': 37, 'Hands': 38, 'Hang': 39, 'Have': 40, 'He': 41, 'Hello': 42, 'Help': 43, 'Hey': 44, 'Hi.': 45, 'Hold': 46, 'Hop': 47, 'How': 48, 'Hug': 49, 'Humor': 50, 'Hurry': 51, 'I': 52, 'Is': 53, 'It': 54, 'Join': 55, 'Keep': 56, 'Kiss': 57, 'Leave': 58, 'Let': 59, 'Lie': 60, 'Listen': 61, 'Look': 62, 'Move': 63, 'No': 64, 'OK.': 65, 'Of': 66, 'Oh': 67, 'Open': 68, 'Perfect': 69, 'Read': 70, 'Really': 71, 'Run': 72, 'See': 73, 'She': 74, 'Shut': 75, 'Sit': 76, 'Skip': 77, 'Slow': 78, 'Stand': 79, 'Stay': 80, 'Stop': 81, 'Take': 82, 'They': 83, 'Tom': 84, 'Tom.': 85, 'Trust': 86, 'Try': 87, 'Turn': 88, 'Wait': 89, 'Wake': 90, 'Wash': 91, 'We': 92, 'Welcome.': 93, 'Well': 94, 'Who': 95, 'Why': 96, 'Wonderful': 97, 'You': 98, 'a': 99, 'aboard': 100, 'above.': 101, 'agree.': 102, 'along.': 103, 'am': 104, 'am.': 105, 'away': 106, 'away.': 107, 'awful': 108, 'back': 109, 'bark': 110, 'busy.': 111, 'calm': 112, 'came.': 113, 'can': 114, 'care': 115, 'care.': 116, 'cares': 117, 'cold.': 118, 'cook.': 119, 'course': 120, 'course.': 121, 'cried.': 122, 'cry.': 123, 'd.': 124, 'died': 125, 'died.': 126, 'done': 127, 'down': 128, 'down.': 129, 'eat': 130, 'envy': 131, 'exists.': 132, 'fair.': 133, 'far': 134, 'fine.': 135, 'fire': 136, 'fire.': 137, 'fly.': 138, 'forgot.': 139, 'free.': 140, 'full.': 141, 'fun.': 142, 'gave': 143, 'get': 144, 'go': 145, 'hard.': 146, 'hate': 147, 'help': 148, 'her.': 149, 'here.': 150, 'him.': 151, 'home': 152, 'home.': 153, 'hope': 154, 'idiot': 155, 'ill.': 156, 'in.': 157, 'is': 158, 'it': 159, 'it.': 160, 'kind.': 161, 'know.': 162, 'late.': 163, 'laughed.': 164, 'lazy.': 165, 'left.': 166, 'll': 167, 'lost': 168, 'lost.': 169, 'lovely': 170, 'luck.': 171, 'm': 172, 'man.': 173, 'me': 174, 'me.': 175, 'mean.': 176, 'move': 177, 'move.': 178, 'nice.': 179, 'night.': 180, 'no': 181, 'not': 182, 'now.': 183, 'off.': 184, 'okay.': 185, 'old.': 186, 'on': 187, 'on.': 188, 'out': 189, 'over.': 190, 'pay.': 191, 'please': 192, 'poor.': 193, 'promise.': 194, 'quit.': 195, 'ran': 196, 'real.': 197, 'relax.': 198, 'resign': 199, 'right.': 200, 'run.': 201, 'runs.': 202, 's': 203, 's.': 204, 'saw': 205, 'short.': 206, 'sick.': 207, 'sing.': 208, 'snowed': 209, 'so.': 210, 'some.': 211, 'sorry.': 212, 'still.': 213, 'sw': 214, 'swim.': 215, 'swims': 216, 't': 217, 'tall.': 218, 'that': 219, 'that.': 220, 'this.': 221, 'tight.': 222, 'tried.': 223, 'tries': 224, 'true.': 225, 'try.': 226, 'up': 227, 'up.': 228, 'us.': 229, 'walk': 230, 'wave': 231, 'way': 232, 'won': 233, 'won.': 234, 'wrong': 235, 'you.': 236, 'young.': 237}\n"
     ]
    }
   ],
   "source": [
    "from utils import GenData\n",
    "\n",
    "\n",
    "def main():\n",
    "    data = GenData('cmn.txt','jieba',200)\n",
    "    print(data.en2id)\n",
    "    weight = loadEmbedding('E:\\\\Desktop\\\\nlp\\\\glove.840B.300d.txt', data.en2id, 300)\n",
    "    print(weight.shape)\n",
    "\n",
    "main()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "if you are\n"
     ]
    }
   ],
   "source": [
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
